- North America > United States > Rhode Island > Providence County > Providence (0.04)
- North America > United States > New Jersey > Mercer County > Princeton (0.04)
- North America > United States > Maine > Cumberland County > Brunswick (0.04)
- (6 more...)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > Canada (0.04)
- Europe > France (0.04)
The Lattice Geometry of Neural Network Quantization -- A Short Equivalence Proof of GPTQ and Babai's algorithm
We explain how data-driven quantization of a linear unit in a neural network corresponds to solving the closest vector problem for a certain lattice generated by input data. We prove that the GPTQ algorithm is equivalent to Babai's well-known nearest-plane algorithm. We furthermore provide geometric intuition for both algorithms. Lastly, we note the consequences of these results, in particular hinting at the possibility of using lattice basis reduction for better quantization.
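To make the connection concrete, here is a minimal NumPy sketch of Babai's nearest-plane algorithm, the procedure the paper proves equivalent to GPTQ. The basis `B` and target `t` in the demo are illustrative placeholders, not the data-generated lattice the paper studies.

```python
import numpy as np

def gram_schmidt(B):
    """Column-wise Gram-Schmidt orthogonalization (no normalization)."""
    Bstar = B.astype(float)
    for j in range(B.shape[1]):
        for i in range(j):
            mu = Bstar[:, j] @ Bstar[:, i] / (Bstar[:, i] @ Bstar[:, i])
            Bstar[:, j] -= mu * Bstar[:, i]
    return Bstar

def babai_nearest_plane(B, t):
    """Return a lattice point B @ c close to the target t.

    B: basis matrix whose columns generate the lattice; t: target vector.
    At step j, the residual is projected onto the j-th Gram-Schmidt
    direction and rounded, picking the nearest translated sublattice plane.
    """
    Bstar = gram_schmidt(B)
    b = t.astype(float)
    c = np.zeros(B.shape[1])
    for j in reversed(range(B.shape[1])):
        c[j] = np.round(b @ Bstar[:, j] / (Bstar[:, j] @ Bstar[:, j]))
        b -= c[j] * B[:, j]
    return B @ c

# Tiny demo with an illustrative basis and target.
B = np.array([[2.0, 1.0], [0.0, 2.0]])
t = np.array([2.7, 1.2])
print(babai_nearest_plane(B, t))  # -> [3. 2.], a nearby lattice point
```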
Non-Asymptotic Length Generalization
Chen, Thomas, Ma, Tengyu, Li, Zhiyuan
Length generalization is the ability of a learning algorithm to learn a hypothesis that generalizes to longer inputs than those in the training set. In this paper, we provide provable guarantees of length generalization for various classes of functions in an idealized setting. First, we formalize the framework of non-asymptotic length generalization, which requires a computable upper bound on the minimum input length that guarantees length generalization, as a function of the complexity of the ground-truth function under some given complexity measure. We refer to this minimum input length as the length complexity. We show that the Minimum-Complexity Interpolator learning algorithm achieves optimal length complexity. We further show that whether a function class admits non-asymptotic length generalization is equivalent to the decidability of its language equivalence problem, which implies that there is no computable upper bound on the length complexity of Context-Free Grammars. On the positive side, we show that the length complexity of Deterministic Finite Automata is $2n - 2$, where $n$ is the number of states of the ground-truth automaton. Our main results are upper bounds on the length complexity of a subset of a transformer-related function class called C-RASP (Yang & Chiang, 2024). We show that the length complexity of 1-layer C-RASP functions is $O(T^2)$ when the ground-truth function has precision $T$, and that the length complexity of 2-layer C-RASP functions is $O(T^{O(K)})$ when the ground-truth function has precision $T$ and $K$ heads.
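The $2n - 2$ bound for DFAs matches the classical fact that two inequivalent DFAs with $n$ states each are separated by some word of length at most $2n - 2$, so agreeing on all inputs up to that length forces agreement everywhere. As a hedged illustration (the DFA encoding below is a hypothetical convention, not the paper's), breadth-first search over the product automaton finds the shortest distinguishing word:

```python
from collections import deque

def distinguishing_word(dfa1, dfa2, alphabet):
    """Shortest word accepted by exactly one of two DFAs, or None.

    Each DFA is a triple (start, accepting_set, delta), where
    delta[(state, symbol)] gives the successor state.
    """
    s1, acc1, d1 = dfa1
    s2, acc2, d2 = dfa2
    seen = {(s1, s2)}
    queue = deque([((s1, s2), "")])
    while queue:
        (p, q), word = queue.popleft()
        if (p in acc1) != (q in acc2):
            return word  # the DFAs disagree on this word
        for a in alphabet:
            nxt = (d1[(p, a)], d2[(q, a)])
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, word + a))
    return None  # no reachable disagreement: the DFAs are equivalent
```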
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
A Centralized Planning and Distributed Execution Method for Shape Filling with Homogeneous Mobile Robots
Liu, Shuqing, Su, Rong, Johansson, Karl H.
Pattern formation is a common task in multi-robot systems. In this paper, we study the problem of forming complex shapes with functionally limited mobile robots, which must rely on other robots to locate themselves precisely. The goal is to decide whether a given shape can be filled by a given set of robots and, if so, to complete the shape formation process as fast as possible with a minimum amount of communication. Traditional approaches either require global coordinates for each robot or fail on complex shapes beyond their capability; the latter shortcoming calls for a decision procedure that can tell, before the shape-forming process starts, whether a target shape can be formed. We develop a method that does not require global coordinate information during execution and can effectively decide whether the desired shape is feasible to form. The latter is achieved via a planning procedure that handles a variety of complex shapes, in particular those with holes, and assigns a simple piece of scheduling information to each robot. This facilitates subsequent distributed execution, which relies only on the coordinates of neighboring robots rather than those of all robots. The effectiveness of our shape-forming approach is illustrated in several simulation case studies.
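As a loose illustration of the feasibility question only (not the paper's planning procedure, which also handles scheduling and holes), a minimal pre-check might verify that the target shape has exactly one grid cell per robot and is 4-connected, so every cell is reachable:

```python
from collections import deque

def can_fill(shape_cells, num_robots):
    """Toy feasibility pre-check for filling a grid shape with robots.

    shape_cells: set of (x, y) cells describing the target shape.
    Requires one cell per robot and 4-connectivity of the shape.
    """
    if len(shape_cells) != num_robots or not shape_cells:
        return False
    start = next(iter(shape_cells))
    seen = {start}
    queue = deque([start])
    while queue:
        x, y = queue.popleft()
        for nb in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if nb in shape_cells and nb not in seen:
                seen.add(nb)
                queue.append(nb)
    return len(seen) == len(shape_cells)  # True iff BFS reached every cell
```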
- Asia > Singapore (0.04)
- North America > United States > California > Monterey County > Pacific Grove (0.04)
- Europe > Sweden > Stockholm > Stockholm (0.04)
Data Compression using Rank-1 Lattices for Parameter Estimation in Machine Learning
Gnewuch, Michael, Harsha, Kumar, Wnuk, Marcin
The mean squared error and regularized versions of it are standard loss functions in supervised machine learning. However, calculating these losses for large data sets can be computationally demanding. Modifying an approach of J. Dick and M. Feischl [Journal of Complexity 67 (2021)], we present algorithms to reduce extensive data sets to a smaller size using rank-1 lattices. Rank-1 lattices are quasi-Monte Carlo (QMC) point sets that are, if carefully chosen, well-distributed in a multidimensional unit cube. The compression strategy in the preprocessing step assigns every lattice point a pair of weights depending on the original data and responses, representing its relative importance. As a result, the compressed data makes iterative loss calculations in optimization steps much faster. We analyze the errors of our QMC data compression algorithms and the cost of the preprocessing step for functions whose Fourier coefficients decay sufficiently fast so that they lie in certain Wiener algebras or Korobov spaces. In particular, we prove that our approach can lead to arbitrarily high convergence rates as long as the functions are sufficiently smooth.
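For intuition, a rank-1 lattice point set is cheap to generate: point $i$ is the fractional part of $i z / n$ for a fixed generating vector $z$. A minimal sketch, using the classical two-dimensional Fibonacci lattice as an example choice (the carefully chosen higher-dimensional vectors and the weighting step from the paper are not reproduced here):

```python
import numpy as np

def rank1_lattice(n, z):
    """Rank-1 lattice point set in the unit cube.

    n: number of points; z: generating vector of length d.
    Point i is the fractional part of i * z / n.
    """
    i = np.arange(n).reshape(-1, 1)
    return (i * np.asarray(z) / n) % 1.0

# Classical 2D Fibonacci lattice: n = F_10 = 55 points, z = (1, F_9) = (1, 34).
pts = rank1_lattice(55, [1, 34])
```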
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- Europe > Germany (0.04)
- Workflow (0.68)
- Research Report (0.50)
Compute-Update Federated Learning: A Lattice Coding Approach
Azimi-Abarghouyi, Seyed Mohammad, Varshney, Lav R.
This paper introduces a federated learning framework that enables over-the-air computation via digital communications, using a new joint source-channel coding scheme. Without relying on channel state information at devices, this scheme employs lattice codes to both quantize model parameters and exploit interference from the devices. We propose a novel receiver structure at the server, designed to reliably decode an integer combination of the quantized model parameters as a lattice point for the purpose of aggregation. We present a mathematical approach to derive a convergence bound for the proposed scheme and offer design remarks. In this context, we suggest an aggregation metric and a corresponding algorithm to determine effective integer coefficients for the aggregation in each communication round. Our results illustrate that, regardless of channel dynamics and data heterogeneity, our scheme consistently delivers superior learning accuracy across various parameters and markedly surpasses other over-the-air methodologies.
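As a toy numerical sketch of the idea (ignoring fading and the channel-state issues the paper actually addresses, and fixing all integer aggregation coefficients to 1), devices quantize their updates onto a scaled integer lattice, the channel superimposes them, and the server rounds the noisy sum back to the lattice:

```python
import numpy as np

rng = np.random.default_rng(0)
delta = 0.05                 # quantization step: lattice is delta * Z^d
K, d = 4, 8                  # number of devices, model dimension

# Each device quantizes its model update onto the scaled integer lattice.
updates = rng.normal(size=(K, d))
quantized = delta * np.round(updates / delta)

# Over-the-air superposition: the channel sums the transmitted lattice
# points; mild additive noise stands in for the real channel.
received = quantized.sum(axis=0) + rng.normal(scale=0.01, size=d)

# The sum of lattice points is itself a lattice point, so the server can
# decode the integer combination (all coefficients 1 here) by rounding.
decoded = delta * np.round(received / delta)
aggregate = decoded / K      # averaged update used for aggregation
```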
- Europe > Sweden > Stockholm > Stockholm (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- Europe > Russia > Northwestern Federal District > Leningrad Oblast > Saint Petersburg (0.04)
- (5 more...)
DotaMath: Decomposition of Thought with Code Assistance and Self-correction for Mathematical Reasoning
Li, Chengpeng, Dong, Guanting, Xue, Mingfeng, Peng, Ru, Wang, Xiang, Liu, Dayiheng
Large language models (LLMs) have made impressive progress in handling simple math problems, yet they still struggle with more challenging and complex mathematical tasks. In this paper, we introduce a series of LLMs that employ Decomposition of thought with code assistance and self-correction for mathematical reasoning, dubbed DotaMath. DotaMath models tackle complex mathematical tasks by decomposing them into simpler logical subtasks, leveraging code to solve these subtasks, obtaining fine-grained feedback from the code interpreter, and engaging in self-reflection and correction. By annotating diverse interactive tool-use trajectories and employing query evolution on the GSM8K and MATH datasets, we generate an instruction fine-tuning dataset called DotaMathQA with 574K query-response pairs. We train a series of base LLMs using imitation learning on DotaMathQA, resulting in DotaMath models that achieve remarkable performance compared to open-source LLMs across various in-domain and out-of-domain benchmarks. Notably, DotaMath-deepseek-7B achieves an outstanding 64.8% on the competitive MATH dataset and 86.7% on GSM8K. In addition, DotaMath-deepseek-7B maintains strong competitiveness on a series of in-domain and out-of-domain benchmarks (Avg. 80.1%). Looking forward, we anticipate that the DotaMath paradigm will open new pathways for addressing intricate mathematical problems. Our code is publicly available at https://github.com/ChengpengLi1003/DotaMath.
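A schematic of the decompose/code/execute/self-correct loop the abstract describes might look as follows; `llm` and `run_code` are hypothetical stand-ins for a model call and a sandboxed interpreter, not the released DotaMath code:

```python
def solve_math_problem(problem, llm, run_code, max_rounds=3):
    """Toy decomposition-of-thought loop with code assistance.

    llm(prompt) -> str and run_code(src) -> result (with .ok, .output,
    .error fields) are hypothetical interfaces, not DotaMath's actual API.
    """
    subtasks = llm(f"Decompose into simpler subtasks: {problem}")
    context = f"Problem: {problem}\nSubtasks: {subtasks}"
    for _ in range(max_rounds):
        code = llm(f"{context}\nWrite Python code for the next subtask.")
        result = run_code(code)  # fine-grained interpreter feedback
        if result.ok:
            context += f"\nCode: {code}\nOutput: {result.output}"
        else:
            # Self-reflection: feed the error back so the model corrects it.
            context += f"\nCode: {code}\nError: {result.error}"
    return llm(f"{context}\nState the final answer.")
```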
- North America > United States (0.07)
- Asia > India (0.06)
- Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
- Asia > China (0.04)